
Data locality and parallelism optimization using a constraint-based approach



Abstract

Embedded applications are becoming increasingly complex and processing ever-increasing datasets. In the context of data-intensive embedded applications, there have been two complementary approaches to enhancing application behavior, namely, data locality optimizations and improving loop-level parallelism. Data locality needs to be enhanced to maximize the number of data accesses satisfied from the higher levels of the memory hierarchy. On the other hand, compiler-based code parallelization schemes require a fresh look for chip multiprocessors as interprocessor communication is much cheaper than off-chip memory accesses. Therefore, a compiler needs to minimize the number of off-chip memory accesses. This can be achieved by considering multiple loop nests simultaneously. Although compilers address these two problems, there is an inherent difficulty in optimizing both data locality and parallelism simultaneously. Therefore, an integrated approach that combines these two can generate much better results than each individual approach. Based on these observations, this paper proposes a constraint network (CN)-based formulation for data locality optimization and code parallelization. The paper also presents experimental evidence, demonstrating the success of the proposed approach, and compares our results with those obtained through previously proposed approaches. The experiments from our implementation indicate that the proposed approach is very effective in enhancing data locality and parallelization. © 2010 Elsevier Inc. All rights reserved.
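
The abstract does not spell out the constraint network itself, so the following is only a minimal sketch of the general idea, under stated assumptions: a constraint network has variables, a finite domain for each variable, and constraints listing which value combinations are compatible. Here the variables are two hypothetical loop nests (choosing a loop order) and one hypothetical array (choosing a memory layout); the names `nest1`, `nest2`, `A`, the domain values, and the allowed pairs are all illustrative and are not taken from the paper. A plain backtracking search stands in for whatever solver the authors actually use.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical toy instance: every name and value below is an
# illustrative assumption, not the paper's formulation.
domains: Dict[str, List[str]] = {
    "nest1": ["i-outer", "j-outer"],    # loop order chosen for loop nest 1
    "nest2": ["i-outer", "j-outer"],    # loop order chosen for loop nest 2
    "A": ["row-major", "col-major"],    # memory layout chosen for array A
}

# Assumed compatible (loop order, layout) pairs, i.e. combinations we
# pretend give good spatial locality for this toy access pattern.
good_pairs: Set[Tuple[str, str]] = {
    ("i-outer", "row-major"),
    ("j-outer", "col-major"),
}

# Binary constraints: each loop nest must agree with the layout of A.
constraints: Dict[Tuple[str, str], Set[Tuple[str, str]]] = {
    ("nest1", "A"): good_pairs,
    ("nest2", "A"): good_pairs,
}


def consistent(assignment: Dict[str, str]) -> bool:
    """Check every constraint whose two variables are both assigned."""
    for (x, y), allowed in constraints.items():
        if x in assignment and y in assignment:
            if (assignment[x], assignment[y]) not in allowed:
                return False
    return True


def solve(assignment: Optional[Dict[str, str]] = None) -> Optional[Dict[str, str]]:
    """Backtracking search: return one full consistent assignment, or None."""
    assignment = dict(assignment or {})
    if len(assignment) == len(domains):
        return assignment
    # Pick the first variable that has no value yet.
    var = next(v for v in domains if v not in assignment)
    for value in domains[var]:
        assignment[var] = value
        if consistent(assignment):
            result = solve(assignment)
            if result is not None:
                return result
        del assignment[var]
    return None


solution = solve()
print(solution)
```

Because both loop nests share the array `A`, the search has to pick loop orders and a layout that agree across nests, which is a small-scale picture of why the abstract argues for considering multiple loop nests simultaneously. A parallelism choice per loop nest could be added as further variables and constraints in the same way.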

Bibliographic details

  • Author

    Ozturk O.

  • Year 2011
  • Original format PDF
  • Language English
